The COVID-19 pandemic and Asian American employment

Recent studies have documented the disparate impact of the COVID-19 pandemic on labor market outcomes for different racial groups. This paper adds to this literature by documenting that the employment of Asian Americans, in particular those with no college education, has been especially hard hit by the economic crisis associated with the onset of the pandemic. This can only partly be explained by differences in demographics, local market conditions, and job characteristics, and it cannot be entirely explained by possibly different selection into education levels across ethnic groups. The burden on Asian Americans is primarily borne by those who are not US-born.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00181-022-02306-5.

The table shows the fraction of individuals in each group who report being "at work" in a given month. The underlying data is from IPUMS CPS and covers individuals aged 25-65 over the period between April 2017 and March 2021. The averages are calculated using sampling weights.
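As an illustration of what a weighted "at work" share involves, the following is a minimal sketch. The records and weights are made up for the example and are not taken from the CPS; the function name `weighted_share` is hypothetical.

```python
def weighted_share(records):
    """Weighted fraction of individuals at work.

    records: list of (at_work, weight) tuples, with at_work in {0, 1}
    and weight a positive sampling weight (illustrative values only).
    """
    total_weight = sum(weight for _, weight in records)
    if total_weight <= 0:
        raise ValueError("sampling weights must sum to a positive number")
    return sum(at_work * weight for at_work, weight in records) / total_weight


# Hypothetical group of four respondents with unequal sampling weights.
sample = [(1, 1500.0), (0, 500.0), (1, 1000.0), (0, 1000.0)]
share = weighted_share(sample)
```

With these made-up numbers the unweighted share would be one half, while the weighted share is larger because the respondents at work carry more weight.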

Robustness
This section contains some robustness checks that appeared in the unpublished version of the paper.
The results presented in the published version are based on estimation of linear probability models. Below, we re-estimate one of the models using a fixed effects logit approach. The results suggest that our conclusions are not driven by this linearity assumption.
We also compare the effects of the pandemic on different ethnic groups to the effects of the Great Recession. We find that, in contrast to the pandemic, there was no notable difference in the effects of the 2008 economic downturn on Asian Americans and Whites. This is perhaps not surprising, since the COVID-19 pandemic affected very different sectors of the economy than previous downturns.

Logit or Linear Probability Model?
The outcome "working" is binary. This raises the question of whether the fixed effects linear probability model adequately captures its relationship with the explanatory variables. We therefore investigate whether our results above are likely to be biased by the use of a linear probability model. This is less straightforward than it might seem because all the models above include interactions that can be thought of as group-specific fixed effects. For example, the estimation leading to Tables 2 to 5 in the paper absorbs 663 interactions between state and month, while the regressions reported in the later tables have up to 5,475 occupation-month interactions. Including this many parameters in a nonlinear model is not only computationally burdensome but can also distort the statistical properties of the estimators of all the parameters. This is known as the incidental parameters problem.
The traditional solution to this well-known problem is to estimate the parameters (other than the fixed effects) by conditional maximum likelihood. This idea dates back to Rasch (1960, 1961). Unfortunately, this conditional likelihood approach to eliminating the fixed effects is computationally infeasible in our application. We therefore use a modified version of the conditional likelihood approach to check whether our results are sensitive to the use of the linear probability model.
Our approach is based on the same insight that leads to the conditional likelihood estimator mentioned above. Specifically, let $i$ index a group of observations (corresponding to a specific value of the interaction in question), and let $j$ denote an observation within group $i$. We assume that the dependent variables $y_{ij}$ (working) for the observations $j$ in group $i$ are independent conditional on the explanatory variables, $x_{ij}$, and on a group-specific effect, $\alpha_i$.
We also assume that the distribution of each $y_{ij}$ is
$$P(y_{ij} = 1 \mid x_{i1}, \ldots, x_{iJ_i}, \alpha_i) = \Lambda(x_{ij}'\beta + \alpha_i), \quad j = 1, \ldots, J_i, \tag{1}$$
where $\Lambda(\cdot)$ denotes the logistic cumulative distribution function and $J_i$ denotes the number of observations in group $i$.
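In this case, the distribution of $(y_{i1}, \ldots, y_{iJ_i})$ conditional on $(x_{i1}, \ldots, x_{iJ_i})$ and on $\sum_{j=1}^{J_i} y_{ij}$ is
$$P\Big(y_{i1}, \ldots, y_{iJ_i} \,\Big|\, x_{i1}, \ldots, x_{iJ_i}, \sum_{j=1}^{J_i} y_{ij}\Big) = \frac{\exp\big(\sum_{j=1}^{J_i} y_{ij} x_{ij}'\beta\big)}{\sum_{d \in B_i} \exp\big(\sum_{j=1}^{J_i} d_j x_{ij}'\beta\big)}, \tag{2}$$
where $B_i = \{(d_1, \ldots, d_{J_i}) \in \{0,1\}^{J_i} : \sum_{j=1}^{J_i} d_j = \sum_{j=1}^{J_i} y_{ij}\}$ is the set of binary vectors with the same number of ones as the observed outcomes in group $i$. The group-specific effect $\alpha_i$ does not appear in (2).
The traditional approach is to estimate $\beta$ by maximizing a conditional likelihood function based on (2). This, however, would be extremely computationally intensive. For example, our sample of women with a high school degree or less contains a group (state-month combination) with 36,071 observations, of whom 18,408 reported working. This means that for this group, the number of terms in the denominator of (2) is $\binom{36{,}071}{18{,}408}$, which is of the order of magnitude of $10^{10{,}852}$. This makes estimation based on the likelihood function in (2) infeasible.
To overcome this, we note that for any two individuals in group $i$, $j_1$ and $j_2$,
$$P(y_{ij_1} = 1 \mid y_{ij_1} + y_{ij_2} = 1, x_{ij_1}, x_{ij_2}, \alpha_i) = \Lambda\big((x_{ij_1} - x_{ij_2})'\beta\big).$$
This can be used to form a likelihood function for a pair of observations within each group, and combining these pairs leads to a pseudo-likelihood function. In principle, one could use all pairs $(j_1, j_2)$ in a group, but this can again be computationally demanding. We therefore pair each observation for which $y_{ij_1} = 1$ with a randomly chosen observation, $ij_2$, with $y_{ij_2} = 0$ in the same group. We also pair each observation for which $y_{ij_1} = 0$ with a randomly chosen observation with $y_{ij_2} = 1$ in the same group. The estimator of $\beta$ is then based on maximizing the pseudo log-likelihood function
$$\sum_{i=1}^{n} \sum_{(j_1, j_2)} \Big[ y_{ij_1} \log \Lambda\big((x_{ij_1} - x_{ij_2})'\beta\big) + (1 - y_{ij_1}) \log\Big(1 - \Lambda\big((x_{ij_1} - x_{ij_2})'\beta\big)\Big) \Big],$$
where the inner sum runs over the pairs formed in group $i$. Statistically, this estimator is an extremum estimator (M-estimator) based on the $n$ groups.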
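The pairing scheme described above can be sketched in a few lines. This is a minimal illustration and not the code used for the paper: it assumes a single scalar regressor, simulated data, and a plain Newton iteration, and all function names are hypothetical. The block also checks the order of magnitude of the binomial coefficient quoted in the text using log-gamma functions.

```python
import math
import random


def log10_binomial(n, k):
    """Base-10 log of C(n, k), via log-gamma to avoid forming a huge integer."""
    ln_c = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return ln_c / math.log(10)


def logistic(z):
    """Numerically stable logistic CDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_pairs(groups, rng):
    """Pair each observation with a random opposite-outcome one in its group.

    groups: list of groups, each a list of (x, y) with scalar x and y in {0, 1}.
    Returns differences x_one - x_zero, where x_one belongs to the pair member
    with y = 1. Groups without variation in y yield no pairs (an assumption of
    this sketch).
    """
    diffs = []
    for obs in groups:
        ones = [x for x, y in obs if y == 1]
        zeros = [x for x, y in obs if y == 0]
        if not ones or not zeros:
            continue
        for x, y in obs:
            if y == 1:
                diffs.append(x - rng.choice(zeros))
            else:
                diffs.append(rng.choice(ones) - x)
    return diffs


def fit_pairwise_logit(diffs, iterations=50):
    """Maximize sum(log Lambda(d * beta)) over scalar beta by Newton's method."""
    beta = 0.0
    for _ in range(iterations):
        grad = 0.0
        hess = 0.0
        for d in diffs:
            p = logistic(d * beta)
            grad += d * (1.0 - p)
            hess -= d * d * p * (1.0 - p)
        if hess == 0.0:
            break
        step = grad / hess
        beta -= step
        if abs(step) < 1e-10:
            break
    return beta


# Simulated data: 200 groups of 20 observations with group effects alpha.
rng = random.Random(0)
true_beta = 1.0
groups = []
for _ in range(200):
    alpha = rng.gauss(0.0, 1.0)
    obs = []
    for _ in range(20):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < logistic(alpha + true_beta * x) else 0
        obs.append((x, y))
    groups.append(obs)

beta_hat = fit_pairwise_logit(make_pairs(groups, rng))
digits = log10_binomial(36071, 18408)
```

The group effects never enter the estimation step, because only within-pair differences of the regressor are used. Standard errors are omitted here; in practice they would need to account for dependence within groups.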
Assuming independence across groups, consistency of the estimator therefore follows from the fact that each term corresponds to a conditional likelihood, and its asymptotic distribution follows from the standard expression for extremum estimators. See, for example, Amemiya (1985). In particular, the estimator will be asymptotically normally distributed with an asymptotic variance that can be estimated by sample analogs.
Table 3 displays the estimated coefficients in a fixed effects logit version of the model in Table 6 of the paper. The relative magnitudes and the statistical significance of the coefficients in Table 3 are not dramatically different from those of the linear probability model. On the other hand, the coefficients from the logit model are not directly comparable to those of the linear probability model. In a cross-sectional setting, one would therefore convert the logit parameters to average marginal effects. This is not possible with a fixed effects logit approach because the average marginal effect depends on the true conditional probability that $y_{ij}$ is 1, which in turn depends on the distribution of the fixed effects conditional on the explanatory variables.
To overcome this, we consider a hypothetical heterogeneous population of East Asians in which the probability of working prior to the pandemic is uniformly distributed between 50% and 90%. The estimated average effects of the pandemic on the probability of working (relative to the effect for Whites with the same education and gender) for this population are declines in the probability of working by 19.1, 14.3, 1.2, 18.9, 11.0, and 1.0 percentage points across the six combinations of education and gender. These are quite similar to the results for the linear probability model in Table 6 of the paper. As a result, we do not further investigate alternatives to the linear probability model.
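One way to translate a logit coefficient into an average change in probability for such a hypothetical population is sketched below. This is an assumption-laden illustration rather than the procedure used for the numbers above: it assumes the effect enters as a shift `delta` in the logit index of each baseline probability, and the value of `delta` is made up.

```python
import math


def shifted_probability(p, delta):
    """Probability after adding delta to the logit index of baseline p."""
    odds = p / (1.0 - p) * math.exp(delta)
    return odds / (1.0 + odds)


def average_effect(delta, low=0.5, high=0.9, steps=10_000):
    """Midpoint-rule average of the probability change for p ~ Uniform(low, high)."""
    width = (high - low) / steps
    total = 0.0
    for s in range(steps):
        p = low + (s + 0.5) * width
        total += shifted_probability(p, delta) - p
    return total / steps


# Hypothetical logit coefficient of -1.0 (not an estimate from the paper).
effect = average_effect(-1.0)
```

The same shift in the logit index produces different changes in probability at different baseline levels, which is why an assumption about the baseline distribution is needed before averaging.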

Comparison to the 2008 Recession
It is difficult to know the mechanisms behind the dramatic impact of the pandemic on Asians, especially East Asians, with lower educational attainment. One possibility is that this is a typical feature of economic downturns. To investigate this possibility, we estimate the same model as in Table 2 of the paper using data from 2006 through 2011, with the "Crisis" variable defined as a dummy variable for the crisis years 2008 through 2011. The results are presented in Table 4. They suggest that there is generally no differential in the impact of the recession between Asians and Whites. This is in sharp contrast to the COVID-19 crisis. The maximum t-statistic of the 18 coefficients that measure the differential Asian-versus-White impact of the crisis is around 2.5, and a joint test of significance across all six combinations of education and gender yields a chi-square test statistic of around 14. Compared to a chi-square distribution with 18 degrees of freedom, whose 5% critical value is about 28.9, this is statistically insignificant.
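The p-value implied by the joint test can be checked with a short calculation. The sketch below uses the closed-form survival function of a chi-square distribution with an even number of degrees of freedom; the statistic of 14 is the approximate value quoted above, and the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    Uses the identity P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Approximate test statistic from the text, with 18 degrees of freedom.
p_joint = chi2_sf_even_df(14.0, 18)
```

The resulting p-value is roughly 0.73, far from any conventional significance level.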

Response to Issues Brought Up In the Editorial Process
This section addresses two specific issues that were brought up in the editorial process.
The first issue is whether the results are sensitive to the specification of the dependence on age as quadratic and whether the number of children matters (as opposed to the presence of any children). To address this, we re-estimated the model reported in Table 6 with a fourth-degree polynomial in age and with dummy variables for the number of children and the number of children under 5. Of the 48 coefficients reported in Table 6 of the paper, 35 were unchanged, twelve changed by one in the third decimal place, and one changed by two in the third decimal place. We therefore conclude that the results are not sensitive to the functional form for age or for the number of children.
Supplementary Material table note: The table shows the coefficients in the logit model with group-specific effects. The dependent variable is working, and the sample is restricted to those who had a job in the previous month. The control variables not reported are: age, age squared, marital status, presence of a child, presence of a child under 5, interactions between calendar quarter and ethnicity as well as their main effects, interactions between ethnicity and Cr4 and Cr5, fixed effects for each combination of state and month starting in April 2020, and fixed effects for occupation (using the 2010 Census occupation coding scheme) and each combination of lagged occupation and month starting in April 2020. Cr2, Cr3, Cr4, and Cr5 refer to the second quarter of 2020 through the first quarter of 2021. The data is from IPUMS CPS and covers individuals aged 25-65 over the period between April 2017 and March 2021. Robust standard errors are clustered at the household level.
The second issue is whether it is more relevant to control for individual or occupation fixed effects. In the paper, we include fixed effects for the interactions between occupation and month dummies (after the onset of the crisis) because we expected this to be economically more relevant than individual-specific fixed effects. It is likely that observations for a given individual in different time periods will be correlated, and the results reported in the paper attempt to account for this by calculating standard errors that are clustered at the household level. To investigate the relative importance of the two types of fixed effects, we re-estimated the model in Table 6 of the paper using no fixed effects and using individual fixed effects (without sampling weights, because weights can be difficult to deal with in a traditional panel data setting if they vary over time).
For comparison, we also re-estimated the model in Table 6 of the paper without weights. The results are given in the three tables below.
It is clear from the estimation results that controlling for the interactions between occupation and the crisis-month dummies makes a larger difference than controlling for individual fixed effects. (See, for example, the numbers displayed in bold.)
The dependent variable is working, and the sample is restricted to those who had a job in the previous month. The control variables not reported are: age, age squared, marital status, presence of a child, presence of a child under 5, interactions between calendar quarter and ethnicity as well as their main effects, fixed effects for each combination of state and month starting in April 2020, interactions between ethnicity and Cr4 and Cr5, and fixed effects for occupation (using the 2010 Census occupation coding scheme) and each combination of lagged occupation and month starting in April 2020. Cr2, Cr3, Cr4, and Cr5 refer to the second quarter of 2020 through the first quarter of 2021. The data is from IPUMS CPS and covers individuals aged 25-65 over the period between April 2017 and March 2021. Robust standard errors are clustered at the household level.